On Label Stream Partition for Efficient Holistic Twig Join
Abstract
Label stream partition is a useful technique for reducing the input I/O cost of holistic twig join by pruning useless streams beforehand. The Prefix Path Stream (PPS) partition scheme is effective for non-recursive XML documents, but it is inefficient for deep recursive XML documents because of the high CPU cost of pruning and merging too many streams for some twig pattern queries involving recursive tags. In this paper, we propose a general stream partition scheme called Recursive Path Stream (RPS), which controls the total number of streams while still providing pruning power. In particular, each recursive path in RPS represents a set of prefix paths that can be recursively expanded from the recursive path. We present algorithms to build the RPS scheme and to prune RPS streams for queries. We also discuss the adaptability of RPS and provide a framework for performance tuning with general RPS based on different application requirements.
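The prefix-path idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's PPS or RPS algorithm: the function names `prefix_path_streams` and `useful_streams` are hypothetical, document-order positions stand in for real element labels, and the pruning check handles only simple path queries (a list of `/` or `//` steps), not full twig patterns.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def prefix_path_streams(xml_text):
    """Partition elements into streams keyed by their root-to-element tag path.

    Assumption: a pre-order counter stands in for a real positional label.
    """
    root = ET.fromstring(xml_text)
    streams = defaultdict(list)
    counter = 0

    def visit(node, path):
        nonlocal counter
        path = path + (node.tag,)
        streams[path].append(counter)
        counter += 1
        for child in node:
            visit(child, path)

    visit(root, ())
    return dict(streams)


def useful_streams(streams, steps):
    """Keep only streams whose prefix path can match a simple path query.

    steps is a list of (axis, tag) pairs, where axis is '/' (child)
    or '//' (descendant). Streams that cannot match are pruned.
    """
    pattern = "^" + "".join(
        ("/" if axis == "/" else "/(?:[^/]+/)*") + re.escape(tag)
        for axis, tag in steps
    ) + "$"
    rx = re.compile(pattern)
    return {p: s for p, s in streams.items() if rx.match("/" + "/".join(p))}


doc = "<a><b><c/></b><d><c/><b><c/></b></d><a><b/></a></a>"
streams = prefix_path_streams(doc)
# Query //d//c: only the streams under a 'd' ancestor need to be read.
kept = useful_streams(streams, [("//", "d"), ("//", "c")])
```

In this toy document the recursive tag `a` already yields separate streams for `a` and `a/a`. In a deeply recursive document the number of such prefix-path streams grows with the nesting depth, which is the cost the abstract attributes to PPS.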
Similar resources
Efficient Processing of Ordered XML Twig Pattern
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. The holistic twig join algorithm has shown its superiority over binary decomposition-based approaches because it efficiently reduces intermediate results. The existing holistic join algorithms, however, cannot deal with ordered twig queries. A straightforward approach that first match...
TwigStackList-: A Holistic Twig Join Algorithm for Twig Query with Not-Predicates on XML Data
As businesses and enterprises generate and exchange XML data more often, there is an increasing need for searching and querying XML data. Much research has been done on matching XML twig queries. However, as far as we know, very little work has examined the efficient processing of XML twig queries with not-predicates. In this paper, we propose a novel holistic twig join algorithm, called Twig...
Holistic Twig Joins on Indexed XML Documents
Finding all the occurrences of a twig pattern specified by a selection predicate on multiple elements in an XML document is a core operation for efficient evaluation of XML queries. Holistic twig join algorithms were proposed recently as an optimal solution when the twig pattern only involves ancestor-descendant relationships. In this paper, we address the problem of efficient processing of holi...
TR A6/05 From Region Encoding To Extended Dewey: On Efficient Processing of XML Twig Pattern Matching
Finding all the occurrences of a twig pattern in an XML database is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process a twig query based on the region encoding labeling scheme. While region encoding supports efficient determination of ancestor-descendant (or parent-child) relationships between two elements, we observe that the information ...
TRACK: A Novel XML Join Algorithm for Efficient Processing Twig Queries
In order to find all occurrences of a tree/twig pattern in an XML database, a number of holistic twig join algorithms have been proposed. However, most of these algorithms focus on identifying a larger query class or using a novel label scheme to reduce I/O operations, and ignore the deficiency of the root-to-leaf strategy. In this paper, we propose a novel twig join algorithm called Track, whi...